Word Sense Selection in Texts: An Integrated Model
Abstract
Early systems for word sense disambiguation (WSD) often depended on individual tailor-made lexical resources, hand-coded with as much lexical information as needed but of severely limited vocabulary size. Recent studies tend to extract lexical information from a variety of existing resources (e.g. machine-readable dictionaries, corpora) for broad coverage. However, this raises the issue of how to combine the information from different resources. Thus, while different types of resource could make different contributions to WSD, studies to date have not shown what contribution they make, how they should be combined, and whether they are equally relevant to all words to be disambiguated. This thesis proposes an Integrated Model as a framework to study the inter-relatedness of three major parameters in WSD: Lexical Resource, Contextual Information, and Nature of Target Words. We argue that it is their interaction which shapes the effectiveness of any WSD system. A generalised, structurally based sense-mapping algorithm was designed to combine various types of lexical resource. This enables information from these resources to be used simultaneously and compatibly, while respecting their distinctive structures. In studying the effect of context on WSD, different semantic relations available from the combined resources were used, and a recursive filtering algorithm was designed to overcome combinatorial explosion. We then investigated, from two directions, how the target words themselves could affect the usefulness of different types of knowledge. In particular, we modelled WSD with the cloze test format, i.e. as texts with blanks and all senses for one specific word as alternative choices for filling the blank. A full-scale combination of WordNet and Roget's Thesaurus was done, linking more than [...] senses. Using these two resources in combination, a range of disambiguation tests was done on more than [...] noun instances from corpus texts of different types, and [...] blanks from real cloze texts. Results show that combining resources is useful for enriching lexical information, and hence for making WSD more effective, though not completely. Also, different target words make different demands on contextual information, and this interaction is closely related to text types. Future work is suggested for expanding the analysis on target nature and making the combination of disambiguation evidence sensitive to the requirements of the word being disambiguated.
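The cloze-test framing mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's algorithm: the function name `choose_sense`, the toy sense inventory, and the simple word-overlap score are all assumptions made for illustration, standing in for the richer evidence that the combined WordNet and Roget's Thesaurus resources would supply.

```python
from typing import Dict, List, Set


def choose_sense(context: List[str], candidates: Dict[str, Set[str]]) -> str:
    """Pick the candidate sense whose related words overlap most with the context.

    `candidates` maps a sense label to a set of words related to that sense.
    Both the labels and the related-word sets are hypothetical toy data.
    """
    context_words = {word.lower() for word in context}
    # max() returns the first best-scoring sense when scores tie.
    return max(candidates, key=lambda sense: len(candidates[sense] & context_words))


# A text with one blank; every sense of the target word is an alternative choice.
text = "She sat on the ____ of the river and watched the water".split()
senses = {
    "bank/financial": {"money", "loan", "deposit", "account"},
    "bank/river": {"river", "water", "shore", "slope"},
}

best = choose_sense([word for word in text if word != "____"], senses)
print(best)  # bank/river
```

The point of the framing is that disambiguation becomes a multiple-choice decision: the context around the blank is fixed, and each sense competes as a filler. Any scoring function could replace the overlap count used here.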
Similar resources
An Integrated Tool for Translation-Memory Maintenance
This paper presents an integrated tool to construct and maintain translation-memory for memory-based machine translation. This tool was aimed to automate constructing and validating translation-memory both in word and in phrase levels from English-Thai parallel texts. To align English-Thai words and phrases, the crucial problems that must be resolved include multiple-word-expression boundary am...
An Integrated FIS-QFD Model for Evaluation of Internet Service Provider
Word Sense Disambiguation: Why Statistics When We Have These Numbers?
Word sense disambiguation continues to be a difficult problem in machine translation (MT). Current methods either demand large amounts of corpus data and training or rely on knowledge of hard selectional constraints. In either case, the methods have been demonstrated only on a small scale and mostly in isolation, where disambiguation is a task by itself. It is not clear that the methods can be sc...
The evolution of the meaning of the word nurse based on the classical texts of Persian literature
Background and Aim: The semantic evolution of a word over time is inevitable, indicating a social, political, religious or cultural process. Nurse is one of the words that has a significant presence in Persian literature texts and has been used in many different meanings such as slave, servant, maid, devotee, obedient, patient and preserver. The purpose of this study is to show its semantic ev...
Publication date: 2000